A Re-interpretation of Jones, Harrold and Stasko Test Information Visualization (Long Version)
Authors
Abstract
The current trend in debugging and testing is to cross-check information collected during several executions. Jones et al., for example, propose using the instruction coverage of passing and failing runs to visualize suspicious statements. This seems promising but lacks a formal justification. In this paper, we show that the method of Jones et al. can be re-interpreted as a data mining procedure. More specifically, the suspicion indicator they define can be rephrased in terms of well-known metrics from the data-mining domain. These metrics characterize association rules between data. With this formal framework we are able to explain limitations of the above indicator. Three significant hypotheses were implicit in the original work, namely: 1) there exists at least one statement that can be considered faulty; 2) the values of the suspicion indicator for different statements should be independent of one another; 3) executing a faulty statement leads, most of the time, to a failure. We show that these hypotheses are hard to fulfill and that the link between the indicator and the correctness of a statement is not straightforward. The underlying idea of association rules is, nevertheless, still promising, and our conclusion highlights some possible directions for improvement.
Résumé : La tendance actuelle en débogage et test de programmes est de recouper des informations rassemblées lors de plusieurs exécutions. Jones et al., par exemple, proposent d'employer la couverture d'instructions calculée pour des exécutions réussissant et échouant afin de visualiser des instructions suspectes. Ceci semble prometteur mais il manque une justification formelle. Dans cet article, nous montrons que la méthode de Jones et al. peut être réinterprétée comme un procédé de fouille de données. Plus particulièrement, l'indicateur de suspicion qu'ils définissent peut être reformulé en termes de métriques bien connues en fouille de données. Ces métriques caractérisent des règles d'association entre les données. Avec ce cadre formel nous pouvons expliquer des limitations de l'indicateur mentionné ci-dessus. Trois hypothèses significatives étaient implicites dans le travail original. À savoir, 1) il existe au moins une instruction qui peut être considérée comme défectueuse ; 2) les valeurs de l'indicateur de suspicion pour différentes instructions doivent être indépendantes les unes des autres ; 3) exécuter une instruction défectueuse conduit la plupart du temps à un échec. Nous prouvons qu'il est difficile de satisfaire ces hypothèses et que le lien entre l'indicateur et la correction d'une instruction n'est pas direct. L'idée fondamentale des règles d'association est, néanmoins, prometteuse, …
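To make the abstract's central idea concrete, here is a minimal Python sketch that computes a coverage-based suspicion score per statement, alongside two standard association-rule metrics (confidence and lift) for the rule "statement executed → run fails". The abstract does not spell out the formula. The score below is the commonly cited form of the Tarantula suspiciousness measure and is an assumption here, as are the function name `suspicion_and_rule_metrics`, the input layout, and the toy data. The exact reformulation given in the paper may differ.

```python
def suspicion_and_rule_metrics(runs):
    """runs: list of (covered_statements, passed) pairs, one per execution.

    Returns {statement: {"suspicion", "confidence", "lift"}}.
    Assumed score: (failed(s)/total_failed) /
                   (failed(s)/total_failed + passed(s)/total_passed).
    """
    total = len(runs)
    total_failed = sum(1 for _, passed in runs if not passed)
    total_passed = total - total_failed
    statements = set().union(*(covered for covered, _ in runs))

    metrics = {}
    for s in sorted(statements):
        failed_s = sum(1 for cov, passed in runs if s in cov and not passed)
        passed_s = sum(1 for cov, passed in runs if s in cov and passed)

        fail_ratio = failed_s / total_failed if total_failed else 0.0
        pass_ratio = passed_s / total_passed if total_passed else 0.0
        denom = fail_ratio + pass_ratio
        suspicion = fail_ratio / denom if denom else 0.0

        # Rule "s executed -> fail": confidence = P(fail | s executed).
        # s was executed at least once, so the denominator is non-zero.
        confidence = failed_s / (failed_s + passed_s)
        # Lift compares that confidence with the overall failure rate.
        base_rate = total_failed / total
        lift = confidence / base_rate if base_rate else 0.0

        metrics[s] = {"suspicion": suspicion,
                      "confidence": confidence,
                      "lift": lift}
    return metrics


# Hypothetical toy data: statement 1 runs in every execution,
# statement 3 runs in the only failing execution and in one passing one.
runs = [({1, 2}, True), ({1, 2}, True), ({1, 3}, False), ({1, 2, 3}, True)]
for stmt, m in suspicion_and_rule_metrics(runs).items():
    print(stmt, m)
```

Note that the ranking always puts some statement on top, whether or not any statement is actually faulty, which is one way to see why the first implicit hypothesis listed in the abstract matters.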
Related resources
Data Mining and Cross-checking of Execution Traces: A Re-interpretation of Jones, Harrold and Stasko Test Information Visualization
The current trend in debugging and testing is to cross-check information collected during several executions. Jones et al., for example, propose to use the instruction coverage of passing and failing runs in order to visualize suspicious statements. This seems promising but lacks a formal justification. In this paper, we show that the method of Jones et al. can be re-interpreted as a data minin...
Visually Encoding Program Test Information to Find Faults in Software
Large test suites are frequently used to evaluate the correctness of software systems and to locate errors. Unfortunately, this process can generate a huge amount of data that is difficult to interpret manually. We have created a system called TARANTULA that visually encodes test data to help find program errors. The system uses a principled color mapping to represent how particular source line...
Technical Note: Visually Encoding Program Test Information to Find Faults in Software
Large test suites are frequently used to evaluate software systems and to locate errors. Unfortunately, this process can generate a huge amount of data that is difficult to interpret manually. We have created a system, TARANTULA, that visually encodes test data to help find program errors. The system uses a principled color mapping to represent source lines in passed and failed tests. It also p...
Theories in Information Visualization: What, Why and How
In this paper we explore the perceived absence and greater need for theories in information visualization through the perspectives of what, why and how. We discuss five possible forms of theory: law, model, framework, taxonomy and interpretation (the “what”), the reasons and potential benefits of theories (the “why”) and possible ways to generate theories (the “how”).
Dagstuhl Seminar
Information Visualization (InfoVis) focuses on the use of visualization techniques to help people understand and analyze data. While related fields such as Scientific Visualization involve the presentation of data that has some physical or geometric correspondence, Information Visualization centers on abstract information without such correspondences. The aim of this seminar was to bring together ...
Journal title:
Volume / issue:
Pages: -
Publication date: 2005